AB Motifer is a software tool able to find directly in nucleotide databases
very distant homologues to an amino acid query sequence. It focuses 
searches on a specific amino acid pattern, scoring the matching and 
intervening residues as specified by the user. The program has been 
developed for searching databases of expressed sequence tags 
(ESTs), but it is also well suited to search genomic sequences. The query 
sequence can be a variable pattern with alternative amino acids or gaps 
and the sequences searched can contain introns or sequencing errors with 
accompanying frame shifts. Other features include options to generate a 
searchable output, set the maximal sequencing error frequency, limit 
searches to given species, or exclude already known matches. Motifer can 
find sequence homologs that other search algorithms would deem unrelated 
or would not find because of sequencing errors or a too large no. of other 
homologs. The ability of Motifer to find relatives to a given sequence is 
exemplified by searches for members of the transforming growth 
factor-.beta. family and for proteins contg. a WW-domain. The functions 
aimed at enhancing EST searches are illustrated by the 'in silico' cloning
of a novel cytochrome P 450 enzyme. 
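
The core idea in the abstract above, matching an amino acid pattern directly against nucleotide sequences, can be illustrated with a minimal sketch. It translates a DNA sequence in all six reading frames using the standard genetic code and reports regular-expression matches. This is not Motifer's actual algorithm: it does no scoring of intervening residues and does not handle introns or frame shifts, and the function names (`translate`, `find_motif`) are hypothetical.

```python
import re
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate complete codons; unknown codons (e.g. with N) become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def find_motif(dna, pattern):
    """Search all six reading frames for an amino acid regex pattern.

    Returns (strand, frame, protein_position, matched_text) tuples.
    """
    dna = dna.upper()
    hits = []
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for frame in range(3):
            protein = translate(seq[frame:])
            for match in re.finditer(pattern, protein):
                hits.append((strand, frame, match.start(), match.group()))
    return hits
```

A variable pattern with alternative amino acids or gaps maps naturally onto regular-expression syntax, for example `W.{20,23}W` for two tryptophans separated by 20 to 23 residues (an illustrative pattern, not one taken from the abstract).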
AB The selection and modification of atoms or functional groups underlie many
of the manipulations central to mol. modeling. It has become even more
important to automate these tasks with the current prevalence of work with
large databases of mols. SUPER-SMILES, a conceptually simple set of
extensions to SMILES line notation, whose key features are addn. and 
deletion facilities, macros, atom tagging, disjunctions, and 
constraints, has been developed. This superset of SMILES enables the 
performance of transformations on individual mol. structures or across 
members of a database with a pattern-matching protocol. The principal advantage of 
SUPER-SMILES is the ability to specify chem. reactions with a very simple 
augmentation of the SMILES line notation. SUPER-SMILES represents a unified 
approach to mol. structure specification and modification and can easily be applied to 
large datasets of mols. This functionality has been implemented within the 
PROMETHEUS suite of CAMD programs. Its use in performing such operations as 
atom-type assignment, protonation of mols., valency checking, and hydrogen addn., is 
demonstrated. 
AB To assemble a catalog of proteins expressed in mammalian photoreceptor cells,
the following exptl. strategy was carried out: the prepn. of a bovine
photoreceptor cell monolayer (PCL) and the methodol. for analyzing
proteins by 2-D gels and mass spectrometry. Trypsin-digested peptides
were sepd. by HPLC and analyzed by a matrix-assisted laser
desorption/ionization time-of-flight (MALDI-TOF) mass spectrometer. Some
peptides were sequenced by Edman degrdn. To identify the proteins, each
peptide-mass data set was compared against an online peptide-mass
fingerprinting database. To standardize the 2-D gel profile, we detd. the pI and mol.
wt. coordinates of some of the major proteins in the photoreceptor cells by running
internal protein markers and by considering theor. values based on the known sequences.
Fourteen major 2-D gel spots were identified and added to our catalog.
AB An efficient and easy-to-use query system for a gene expression database
is described. Using such a system, one can easily identify genes or
expressed sequence tags whose expression correlates to
particular tissue types. Various tissue types may correspond to different
diseases, states of disease progression, different organs, different
species, etc. Researchers may now use large scale gene expression
databases to full advantage.
AB A review, with 35 refs. Research and development involving the use of
cell lines require precise knowledge of the purity and species of origin 
of the cell lines used. This can only be assured by periodic monitoring 
of cultured cell lines for possible contamination by other cells and for 
characteristics that authenticate the cell line identity. In the absence 
of such monitoring, inter- and intraspecies cell line contaminations are 
likely to occur in the labs. of unsuspecting investigators and can result
in the generation of mistaken conclusions with an attendant loss of
investigators' time, effort, and resources. This chapter provides a
history and an overview of the methods that have been developed for cell 
line authentication, the type of information each of these different 
methods provides, and how synthesis of that information can be used to 
characterize a cell line and confirm its identity. An effective cell line 
monitoring strategy is described that involves testing for a combination 
of genetic markers, including cell membrane species antigens, 
isoenzymes, chromosomes, and DNA fingerprints, and use of 
databases for each marker system to compare the results 
obtained with a test cell culture with results from an extensive panel of 
previously tested cell lines. (c) 1998 Academic Press.
AB We describe the coupling of a microfabricated fluidic device to an
electrospray ionization (ESI) quadrupole time-of-flight mass spectrometer 
(QqTOFMS) for the identification of protein samples. The microfabricated 
devices consisted of three reservoirs connected via channels to a main 
capillary, which in turn was linked via a microspray interface to the 
QqTOFMS. Here we present preliminary results obtained using this system. 
Standardized solns. of myoglobin tryptic digest were analyzed indicating a 
limit of detection at the low to sub fmol/.mu.L. The combination of the
microfabricated device for rapid sample delivery and the rapid acquisition 
capability, enhanced resoln. and mass accuracy of the QqTOF offers unique 
possibilities for the rapid identification of proteins by database 
searching. This platform can generate MS data suitable for protein 
database searching by the peptide-mass fingerprinting 
approach and MS/MS data suitable for protein database searching. Here the 
results of the two database-searching approaches are compared and the 
possibilities of combining the two approaches for rapid identification of 
protein are discussed. Also, we present a comparison of the results 
obtained using the three-position microfabricated device coupled to the 
ESI-QqTOFMS and to an ESI-ion trap MS. Finally the combination of 
C-terminal 18O labeling of peptides and the microfabricated system for
automated combined peptide-mass fingerprinting and sequence-tag
database searching is discussed.
AB Ninety genotypically characterized Aeromonas strains, including members of
all 14 currently established genospecies, were studied by performing
gas-liq. chromatog. anal. of their cellular fatty acid Me esters (FAMEs).
A total of 44 fatty acids and two alcs. were found in members of the genus
Aeromonas. All 90 strains contained 12:0, 13:0 iso, 14:0, 15:0 iso 3OH,
16:0, 16:1 .omega.7c, 17:0 iso, iso 17:1 .omega.9c, summed feature 3 (16:1
iso I and/or 14:0 3OH), and summed feature 7 (18:1 .omega.7c, 18:1
.omega.9t, and/or 18:1 .omega.12t), whereas all but one strain (99%) also
contained 15:0 iso. Although the FAME profiles were very similar, minor
quant. variations could be used to differentiate phenospecies and/or
hybridization groups. A cluster anal. of the mean data revealed five FAME
clusters, which were compared with phenotypic and genotypic groups 
identified in the genus Aeromonas. Hybridization groups that constituted 
the Aeromonas hydrophila complex, the Aeromonas caviae complex, and the 
Aeromonas sobria complex were basically grouped into distinct FAME 
clusters. The taxonomic positions of hybridization groups 7 and 11 in
these clusters, however, remained unclear. All of our results were highly 
reproducible. A new database of Aeromonas FAME fingerprints was generated, and 
this database can be used for rapid identification of unknown aeromonads. Using a 
large set of well-characterized aeromonads, we demonstrated for the first time that
gas-liq. chromatog. FAME anal. can be used to differentiate the majority of the
phenospecies and/or hybridization groups in the genus Aeromonas. 
AB Proteins can be identified using a set of peptide fragment wts. produced
by a specific digestion to search a protein database in which sequences
have been replaced by fragment wts. calcd. for various cleavage methods.
We present a method using multidimensional searches that greatly increases
the confidence level for identification, allowing DNA sequence databases 
to be examd. This method provides a link between 2-dimensional gel 
electrophoresis protein databases and genome sequencing projects. 
Moreover, the increased confidence level allows unknown proteins to be 
matched to expressed sequence tags, potentially eliminating the 
need to obtain sequence information for cloning. Database searching from 
a mass profile is offered as a free service by an automatic server at the
Swiss Federal Institute of Technol. (ETH), Zurich. For information, send
an electronic message to the address cbrg@inf.ethz.ch with the line: help
mass search, or help all.
AB The authors developed an algorithm for identifying proteins at the
submicrogram level without sequence detn. by chem. degrdn. The protein, 
usually isolated by 1- or 2-dimensional gel electrophoresis, is digested 
by enzymic or chem. means and the masses of the resulting peptides are 
detd. by mass spectrometry. The resulting mass profile, i.e., the list of 
the mol. masses of peptides produced by the digestion, serves as a 
fingerprint which uniquely defines a particular protein. This 
fingerprint may be used to search the database of known 
sequences to find proteins with a similar profile. If the protein is not 
yet sequenced the profile can serve as a unique marker. This 
provides a rapid and sensitive link between genomic sequences and 2D gel 
electrophoresis mapping of cellular proteins. 
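
The mass-profile search described in the abstract above can be sketched as follows. This is a minimal illustration and not the authors' algorithm. It assumes tryptic cleavage (after K or R, except before P), uses standard monoisotopic residue masses, and applies a simple count-of-matches score. The function names, the 0.5 Da tolerance, and the scoring rule are all assumptions made for the example.

```python
import re

# Monoisotopic residue masses (Da) for the 20 standard amino acids.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide for the free termini


def tryptic_peptides(protein):
    """In silico trypsin digest: cleave after K or R unless P follows."""
    return [p for p in re.split(r"(?<=[KR])(?!P)", protein) if p]


def peptide_mass(peptide):
    """Monoisotopic mass of an unmodified, neutral peptide."""
    return sum(MONO[residue] for residue in peptide) + WATER


def mass_profile(protein):
    """Sorted list of theoretical peptide masses for one protein."""
    return sorted(peptide_mass(p) for p in tryptic_peptides(protein))


def score(observed, protein, tol=0.5):
    """Count observed masses that lie within tol Da of a theoretical mass."""
    theoretical = mass_profile(protein)
    return sum(any(abs(m - t) <= tol for t in theoretical) for m in observed)


def search(observed, database, tol=0.5):
    """Rank database entries (name -> sequence) by descending score."""
    return sorted(
        ((score(observed, seq, tol), name) for name, seq in database.items()),
        reverse=True,
    )
```

Calling `search(observed_masses, {"protA": "MKWVTFISLLR", "protB": "GASPVTCK"})` returns `(score, name)` pairs with the best-matching entry first. The sequences here are made-up placeholders. A real search would also need to account for missed cleavages, modifications, and instrument mass accuracy.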



